A novel ab initio identification system of transcriptional regulation motifs in genome DNA sequences based on direct comparison scheme of signal/noise distributions

نویسندگان

  • Ryo Nakaki
  • Jiyoung Kang
  • Masaru Tateno
چکیده

A novel ab initio parameter-tuning-free system to identify transcriptional factor (TF) binding motifs (TFBMs) in genome DNA sequences was developed. It is based on the comparison of two types of frequency distributions with respect to the TFBM candidates in the target DNA sequences and the non-candidates in the background sequence, with the latter generated by utilizing the intergenic sequences. For benchmark tests, we used DNA sequence datasets extracted by ChIP-on-chip and ChIP-seq techniques and identified 65 yeast and four mammalian TFBMs, with the latter including gaps. The accuracy of our system was compared with those of other available programs (i.e. MEME, Weeder, BioProspector, MDscan and DME) and was the best among them, even without tuning of the parameter set for each TFBM and pre-treatment/editing of the target DNA sequences. Moreover, with respect to some TFs for which the identified motifs are inconsistent with those in the references, our results were revealed to be correct, by comparing them with other existing experimental data. Thus, our identification system does not need any other biological information except for gene positions, and is also expected to be applicable to genome DNA sequences to identify unknown TFBMs as well as known ones.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

smyRNA: A Novel Ab Initio ncRNA Gene Finder

BACKGROUND Non-coding RNAs (ncRNAs) have important functional roles in the cell: for example, they regulate gene expression by means of establishing stable joint structures with target mRNAs via complementary sequence motifs. Sequence motifs are also important determinants of the structure of ncRNAs. Although ncRNAs are abundant, discovering novel ncRNAs on genome sequences has proven to be a h...

متن کامل

Reliable prediction of regulator targets using 12 Drosophila genomes.

Gene expression is regulated pre- and post-transcriptionally via cis-regulatory DNA and RNA motifs. Identification of individual functional instances of such motifs in genome sequences is a major goal for inferring regulatory networks yet has been hampered due to the motifs' short lengths that lead to many chance matches and poor signal-to-noise ratios. In this paper, we develop a general metho...

متن کامل

Superior ab initio identification, annotation and characterisation of TEs and segmental duplications from genome assemblies

Transposable Elements (TEs) are mobile DNA sequences that make up significant fractions of amniote genomes. However, they are difficult to detect and annotate ab initio because of their variable features, lengths and clade-specific variants. We have addressed this problem by refining and developing a Comprehensive ab initio Repeat Pipeline (CARP) to identify and cluster TEs and other repetitive...

متن کامل

In silico structural analysis of quorum sensing genes in Vibrio fischeri

Quorum sensing controls the luminescence of Vibrio fischeri through the transcriptional activator LuxR and the specific autoinducer signal produced by luxI. Amino acid sequences of these two genes were analyzed using bioinformatics tools. LuxI consists of 193 amino acids and appears to contain five α-helices and six ß-sheets when analyzed by SSpro8. LuxI belongs to the autoinducer synthetase fa...

متن کامل

Design of nonlinear parity approach to fault detection and identification based on Takagi-Sugeno fuzzy model and unknown input observer in nonlinear systems

In this study, a novel fault detection scheme is developed for a class of nonlinear system in the presence of sensor noise. A nonlinear Takagi-Sugeno fuzzy model is implemented to create multiple models. While the T-S fuzzy model is used for only the nonlinear distribution matrix of the fault and measurement signals, a larger category of nonlinear systems is considered. Next, a mapping to decou...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره 40  شماره 

صفحات  -

تاریخ انتشار 2012